Add ContinueAsNew fresh-trace support for periodic orchestrations #1337

Open
chandramouleswaran wants to merge 2 commits into Azure:main from chandramouleswaran:feature/continue-as-new-fresh-trace

@chandramouleswaran

Summary

Long-running periodic orchestrations that use ContinueAsNew accumulate all generations into a single distributed trace. This adds an opt-in ContinueAsNewTraceBehavior.StartNewTrace option that starts the next generation in a fresh trace.

Motivation

Orchestrations that run on a schedule (e.g., every 5 hours) via ContinueAsNew end up with a single trace spanning days/weeks/months, making:

  • Individual cycle performance hard to measure
  • Trace viewers slow/unresponsive with thousands of spans
  • Anomaly detection across cycles impossible

API Changes

  • ContinueAsNewOptions class with TraceBehavior property
  • ContinueAsNewTraceBehavior enum: PreserveTraceContext (default) | StartNewTrace
  • New ContinueAsNew(string, object, ContinueAsNewOptions) overload on OrchestrationContext

```csharp
// Start a fresh trace for the next generation
context.ContinueAsNew(null, input, new ContinueAsNewOptions
{
    TraceBehavior = ContinueAsNewTraceBehavior.StartNewTrace,
});
```

Implementation

Fresh-trace behavior is signaled by a typed bool GenerateNewTrace property on ExecutionStartedEvent rather than by tags. TraceHelper consumes the property once: it creates a fresh root producer span, stores that span's identity in ParentTraceContext, and resets the property to false. Subsequent replays use the persisted identity, so the span ID stays stable across replays.

Signal flow

  1. Orchestrator calls ContinueAsNew(version, input, options) → sets ContinueAsNewTraceBehavior on OrchestrationCompleteOrchestratorAction
  2. Dispatcher creates the next ExecutionStartedEvent with GenerateNewTrace = true and skips copying the old ParentTraceContext
  3. TraceHelper sees GenerateNewTrace, creates a fresh root producer span, stores its identity in ParentTraceContext, and resets GenerateNewTrace = false
  4. Subsequent replays use the persisted identity — stable span ID across replays
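
The signal flow above can be sketched with simplified stand-in types. This is only an illustration under stated assumptions: `StartedEvent`, `TraceContext`, `TraceBehavior`, and `SignalFlowSketch` are hypothetical names, not the real DurableTask.Core classes, and a GUID stands in for the identity that the real code takes from a producer span.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins, NOT the real DurableTask.Core types.
enum TraceBehavior { PreserveTraceContext = 0, StartNewTrace = 1 }

sealed class TraceContext
{
    public string TraceId = "";
    public string SpanId = "";
}

sealed class StartedEvent
{
    public bool GenerateNewTrace;
    public TraceContext ParentTraceContext;
    public Dictionary<string, string> Tags;
}

static class SignalFlowSketch
{
    // Step 2: the dispatcher builds the next generation's start event.
    public static StartedEvent CreateContinuation(StartedEvent current, TraceBehavior behavior)
    {
        bool fresh = behavior == TraceBehavior.StartNewTrace;
        return new StartedEvent
        {
            GenerateNewTrace = fresh,
            // Skip the old parent context when a fresh trace was requested.
            ParentTraceContext = fresh ? null : current.ParentTraceContext,
            // Clone the tags so the two generations never share a dictionary.
            Tags = current.Tags is null ? null : new Dictionary<string, string>(current.Tags),
        };
    }

    // Steps 3 and 4: consume the flag once, then reuse the stored identity.
    public static TraceContext ResolveTrace(StartedEvent e)
    {
        if (e.GenerateNewTrace)
        {
            e.GenerateNewTrace = false;
            e.ParentTraceContext = new TraceContext
            {
                // Assumption: random IDs stand in for a real root span's identity.
                TraceId = Guid.NewGuid().ToString("N"),
                SpanId = Guid.NewGuid().ToString("N").Substring(0, 16),
            };
        }
        return e.ParentTraceContext ?? new TraceContext();
    }
}

static class Program
{
    static void Main()
    {
        var gen1 = new StartedEvent
        {
            ParentTraceContext = new TraceContext { TraceId = "old-trace", SpanId = "old-span" },
        };
        var gen2 = SignalFlowSketch.CreateContinuation(gen1, TraceBehavior.StartNewTrace);

        TraceContext first = SignalFlowSketch.ResolveTrace(gen2);   // creates the fresh identity
        TraceContext replay = SignalFlowSketch.ResolveTrace(gen2);  // reuses the stored identity

        Console.WriteLine(first.TraceId != "old-trace");
        Console.WriteLine(first.SpanId == replay.SpanId);
    }
}
```

In this sketch the second call to `ResolveTrace` finds the flag already cleared and returns the stored context, which is the property the replay-safety argument depends on.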

Design decisions

| Decision | Rationale |
| --- | --- |
| Typed bool property on ExecutionStartedEvent instead of tags | Avoids customer tag namespace collision; avoids cross-generation tag leaking through CloneTags; typed and self-documenting |
| Tags are now cloned (not shared by reference) | Prevents mutation of the current generation's tag dictionary during continuation |
| Base class throws NotSupportedException | External OrchestrationContext implementations cannot silently ignore StartNewTrace |
| Single 3-param overload (version + input + options) | The 2-param overload (object input, ContinueAsNewOptions options) was ambiguous with (string newVersion, object input); users pass null for version to keep current |
| Legacy Correlation pipeline unchanged | The two trace systems (System.Diagnostics.Activity vs legacy Correlation) are independent; only ParentTraceContext controls the new pipeline |
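
To illustrate the "base class throws NotSupportedException" decision, here is a minimal sketch. `ContextSketch`, `OptionsSketch`, and `LegacyContext` are hypothetical stand-ins and the exception message is invented for the example. It shows only the general pattern: a virtual overload keeps existing subclasses compiling, while the default body refuses the call instead of dropping the options.

```csharp
using System;

// Hypothetical stand-ins; the real types live in DurableTask.Core.
sealed class OptionsSketch
{
    public bool StartNewTrace { get; set; }
}

abstract class ContextSketch
{
    public abstract void ContinueAsNew(string newVersion, object input);

    // New overload: virtual, so existing subclasses still compile,
    // but the default refuses rather than silently dropping the options.
    public virtual void ContinueAsNew(string newVersion, object input, OptionsSketch options)
    {
        throw new NotSupportedException(
            GetType().Name + " does not support ContinueAsNew with options.");
    }
}

// An external implementation written before the overload existed.
sealed class LegacyContext : ContextSketch
{
    public object LastInput;

    public override void ContinueAsNew(string newVersion, object input) => LastInput = input;
}

static class Program
{
    static void Main()
    {
        var context = new LegacyContext();
        context.ContinueAsNew(null, "tick"); // the old overload keeps working

        try
        {
            context.ContinueAsNew(null, "tick", new OptionsSketch { StartNewTrace = true });
        }
        catch (NotSupportedException ex)
        {
            // The caller learns immediately that the request was not honored.
            Console.WriteLine(ex.Message);
        }
    }
}
```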

Replay safety

  • GenerateNewTrace is consumed once and reset to false
  • The result (trace identity in ParentTraceContext.Id/.SpanId) is persisted
  • Subsequent replays restore from persisted identity, not the signal
  • On crash before first persist, a new trace is created (consistent since no state from the abandoned attempt survived)

Backward Compatibility

  • Default behavior is PreserveTraceContext — zero change for existing users
  • GenerateNewTrace defaults to false — pre-upgrade serialized events deserialize correctly
  • ContinueAsNewTraceBehavior defaults to PreserveTraceContext (0) on the action
  • Existing ContinueAsNew(object) and ContinueAsNew(string, object) are unchanged
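
The deserialization point can be illustrated with a small sketch. Note the assumptions: `StartedEventDto` is a hypothetical stand-in for the history event, and System.Text.Json is used here only because it keeps the example self-contained; the PR itself marks the property with [DataMember], so the serializer actually involved may differ.

```csharp
using System;
using System.Text.Json;

// Hypothetical stand-in for the history event; only the relevant members are modeled.
sealed class StartedEventDto
{
    public string Name { get; set; } = "";
    public bool GenerateNewTrace { get; set; } // bool defaults to false
}

static class Program
{
    static void Main()
    {
        // A payload written before the upgrade has no GenerateNewTrace member.
        string legacyJson = "{\"Name\":\"PeriodicCleanup\"}";
        StartedEventDto legacy = JsonSerializer.Deserialize<StartedEventDto>(legacyJson);

        // The missing member leaves the property at its default value.
        Console.WriteLine(legacy.GenerateNewTrace); // prints "False"

        // A payload written after the upgrade round-trips the flag.
        string upgradedJson = JsonSerializer.Serialize(
            new StartedEventDto { Name = "PeriodicCleanup", GenerateNewTrace = true });
        StartedEventDto upgraded = JsonSerializer.Deserialize<StartedEventDto>(upgradedJson);
        Console.WriteLine(upgraded.GenerateNewTrace); // prints "True"
    }
}
```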

Tests

31 tests pass, covering:

  • GenerateNewTrace property: default value, copy constructor, JSON serialization, backward compat
  • Tag isolation: property doesn't appear in or leak through tags
  • TraceHelper: fresh trace creation, consume-and-reset, replay with persisted identity, ambient activity isolation
  • TaskOrchestrationContext: new overloads, last-call-wins, null options rejection
  • Base class NotSupportedException
  • ContinueAsNewOptions defaults

chandramouleswaran and others added 2 commits April 15, 2026 15:18
Long-running periodic orchestrations that use ContinueAsNew accumulate all
generations into a single distributed trace, making individual cycles hard
to observe. This adds an opt-in ContinueAsNewTraceBehavior.StartNewTrace
option that starts the next generation in a fresh trace.

API changes:
- Added ContinueAsNewOptions class with TraceBehavior property
- Added ContinueAsNewTraceBehavior enum (PreserveTraceContext, StartNewTrace)
- Added ContinueAsNew(string, object, ContinueAsNewOptions) overload on
  OrchestrationContext (virtual, throws NotSupportedException by default)
- TaskOrchestrationContext overrides it to set the behavior on the action

Implementation:
- Added GenerateNewTrace property on ExecutionStartedEvent (typed bool,
  [DataMember], defaults to false for backward compatibility)
- Dispatcher sets GenerateNewTrace=true on the continuation event when
  StartNewTrace is requested, and skips copying ParentTraceContext
- TraceHelper consumes the flag once, creates a fresh root producer span,
  stores the new identity in ParentTraceContext, and resets the flag
- Subsequent replays use the persisted identity (stable span across replays)

Design decisions:
- Used a typed property instead of tags to avoid customer namespace collision
  and tag-leak bugs through CloneTags
- Tags are now cloned (not shared by reference) to prevent mutation of the
  current generation's tag dictionary
- Base class throws NotSupportedException instead of silently dropping options
- Only one new overload (3-param with version) to avoid overload ambiguity
  between ContinueAsNew(object, ContinueAsNewOptions) and
  ContinueAsNew(string, object)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 16, 2026 07:12
@azure-pipelines

Azure Pipelines:
1 pipeline(s) require an authorized user to comment /azp run to run.

Copilot AI left a comment

Pull request overview

Adds an opt-in mechanism for ContinueAsNew to start the next orchestration generation in a fresh distributed trace (instead of accumulating all generations into one long trace), improving observability for periodic/long-running orchestrations.

Changes:

  • Introduces ContinueAsNewOptions + ContinueAsNewTraceBehavior and a new OrchestrationContext.ContinueAsNew(string, object, ContinueAsNewOptions) overload.
  • Propagates a new ExecutionStartedEvent.GenerateNewTrace signal through continuation creation and consumes it in TraceHelper to create a fresh root trace and persist stable replay identity.
  • Clones continuation tags to avoid cross-generation mutation and adds test coverage for the new behavior.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/DurableTask.Core/Tracing/TraceHelper.cs | Consumes GenerateNewTrace to create a fresh root trace for the next generation. |
| src/DurableTask.Core/TaskOrchestrationDispatcher.cs | Sets GenerateNewTrace on continued-as-new ExecutionStartedEvent, skips parent context copy when starting a new trace, and clones tags. |
| src/DurableTask.Core/TaskOrchestrationContext.cs | Adds the new overload and flows options.TraceBehavior into the completion action. |
| src/DurableTask.Core/OrchestrationContext.cs | Adds the new virtual overload (default throws NotSupportedException). |
| src/DurableTask.Core/History/ExecutionStartedEvent.cs | Adds the persisted GenerateNewTrace flag to history. |
| src/DurableTask.Core/ContinueAsNewOptions.cs | New options type + enum controlling fresh-trace behavior. |
| src/DurableTask.Core/Command/OrchestrationCompleteOrchestratorAction.cs | Adds ContinueAsNewTraceBehavior to the completion action for dispatcher consumption. |
| Test/DurableTask.Core.Tests/ContinueAsNewTraceBehaviorTests.cs | Adds tests for serialization/back-compat, trace creation/consumption, overload behavior, and tag isolation. |

Comments suppressed due to low confidence (1)

src/DurableTask.Core/Tracing/TraceHelper.cs:107

  • TraceHelper no longer honors the OrchestrationTags.CreateTraceForNewOrchestration ("MS_CreateTrace") tag. Since this tag is still a public constant, any existing callers that relied on setting it on ExecutionStartedEvent.Tags to force root trace creation will silently stop working. Consider keeping backward-compat by treating the tag as an alias for GenerateNewTrace (and optionally removing it from Tags after consumption), or explicitly obsoleting/removing the public tag constant as part of the change.

```csharp
// When GenerateNewTrace is set, create a fresh root trace for this orchestration.
// The flag is consumed once and reset so that subsequent replays use the
// persisted trace identity rather than creating yet another new trace.
if (startEvent.GenerateNewTrace)
{
    startEvent.GenerateNewTrace = false;
    // Note that if we create the trace activity for starting a new orchestration here, then its duration will be longer since its end time will be set to once we
    // start processing the orchestration rather than when the request for a new orchestration is committed to storage.
    using var activityForNewOrchestration = StartActivityForNewOrchestration(startEvent);
}
```

```diff
 public override void ContinueAsNew(object input)
 {
-    ContinueAsNew(null, input);
+    ContinueAsNewCore(null, input, new ContinueAsNewOptions());
```

Copilot AI Apr 16, 2026

ContinueAsNew(object) now calls the private ContinueAsNewCore directly instead of delegating to the virtual ContinueAsNew overload(s). This changes extensibility/behavior for any derived TaskOrchestrationContext types that override ContinueAsNew(string, object) or ContinueAsNew(string, object, ContinueAsNewOptions) and previously expected ContinueAsNew(object) to route through those overrides. To preserve polymorphic behavior, consider delegating to the overload (e.g., ContinueAsNew(null, input, new ContinueAsNewOptions())) rather than calling the private core method here.

Suggested change

```diff
-ContinueAsNewCore(null, input, new ContinueAsNewOptions());
+this.ContinueAsNew(null, input, new ContinueAsNewOptions());
```
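
The polymorphism concern in this comment can be reproduced with a reduced sketch. All four class names below are hypothetical, and the overloads are simplified to two parameters (the PR's overload also takes ContinueAsNewOptions). The only difference between the two base classes is whether the single-argument method calls the private core or the virtual overload.

```csharp
using System;

// Variant 1: the single-argument method calls the private core directly.
class DirectContext
{
    public virtual void ContinueAsNew(object input) => ContinueAsNewCore(null, input);
    public virtual void ContinueAsNew(string newVersion, object input) => ContinueAsNewCore(newVersion, input);
    private void ContinueAsNewCore(string newVersion, object input) { }
}

// Variant 2: the single-argument method delegates to the virtual overload.
class DelegatingContext
{
    public virtual void ContinueAsNew(object input) => this.ContinueAsNew(null, input);
    public virtual void ContinueAsNew(string newVersion, object input) => ContinueAsNewCore(newVersion, input);
    private void ContinueAsNewCore(string newVersion, object input) { }
}

// Derived types that hook the two-parameter overload, e.g. for auditing.
class AuditedDirect : DirectContext
{
    public int Hits;
    public override void ContinueAsNew(string newVersion, object input)
    {
        Hits++;
        base.ContinueAsNew(newVersion, input);
    }
}

class AuditedDelegating : DelegatingContext
{
    public int Hits;
    public override void ContinueAsNew(string newVersion, object input)
    {
        Hits++;
        base.ContinueAsNew(newVersion, input);
    }
}

static class Program
{
    static void Main()
    {
        var direct = new AuditedDirect();
        direct.ContinueAsNew(42);        // bypasses the override
        Console.WriteLine(direct.Hits);  // prints "0"

        var delegating = new AuditedDelegating();
        delegating.ContinueAsNew(42);        // routes through the override
        Console.WriteLine(delegating.Hits);  // prints "1"
    }
}
```

Whether this matters in practice depends on whether any derived context types override these methods, which the sketch does not establish.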
